String Editing and Longest Common
Abstract
The string editing problem for input strings x and y consists of transforming x into y by performing a series of weighted edit operations on x of overall minimum cost. An edit operation on x can be the deletion of a symbol from x, the insertion of a symbol in x or the substitution of a symbol of x with another symbol. String editing models a variety of problems arising in such diverse areas as text and speech processing, geology and, last but not least, molecular biology. Special cases of string editing include the longest common subsequence problem, local alignment and similarity searching in DNA and protein sequences, and approximate string searching. We describe serial and parallel algorithmic solutions for the problem and some of its basic variants.
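The serial solution can be illustrated with the textbook dynamic-programming recurrence over prefixes of x and y. The sketch below is only an illustration under stated assumptions, not necessarily the algorithm the paper describes: the function names `edit_distance` and `lcs_length` and the uniform per-operation costs are hypothetical choices made for this example. It also shows how the longest common subsequence arises as a special case: when a substitution costs as much as a deletion plus an insertion, the optimal cost equals len(x) + len(y) minus twice the LCS length.

```python
def edit_distance(x, y, ins_cost=1, del_cost=1, sub_cost=1):
    """Minimum total cost of transforming x into y.

    Hypothetical helper: costs are assumed uniform per operation type,
    which is a simplification of general weighted edit operations.
    """
    m, n = len(x), len(y)
    # prev[j] = cost of transforming the empty prefix of x into y[:j]
    prev = [j * ins_cost for j in range(n + 1)]
    for i in range(1, m + 1):
        # curr[0] = cost of deleting the first i symbols of x
        curr = [i * del_cost] + [0] * n
        for j in range(1, n + 1):
            # Keep x[i-1] if it matches y[j-1], otherwise substitute it.
            diag = prev[j - 1] + (0 if x[i - 1] == y[j - 1] else sub_cost)
            curr[j] = min(
                prev[j] + del_cost,      # delete x[i-1]
                curr[j - 1] + ins_cost,  # insert y[j-1]
                diag,
            )
        prev = curr
    return prev[n]


def lcs_length(x, y):
    """LCS length derived from the insert/delete-only edit distance."""
    # Pricing a substitution at 2 makes it no cheaper than delete + insert,
    # so the optimal cost is len(x) + len(y) - 2 * LCS(x, y).
    d = edit_distance(x, y, ins_cost=1, del_cost=1, sub_cost=2)
    return (len(x) + len(y) - d) // 2
```

The table is filled row by row and only two rows are kept, so the sketch uses O(mn) time and O(n) space for strings of lengths m and n.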
Similar resources
Faster Parallel Computation of Edit Distances
Fast parallel algorithms are given for the longest common subsequence problem and the string editing problem with bounded weights. In the COMMON PRAM model, the algorithm for the longest common subsequence problem takes time O(log m), where m is the length of the shorter string, while the algorithm for the string editing problem with bounded weights takes time O(max(log m, log n / log log n)), ...
Longest Common Subsequences
The length of a longest common subsequence (LLCS) of two or more strings is a useful measure of their similarity. The LLCS of a pair of strings is related to the 'edit distance', or number of mutations/errors/editing steps required in passing from one string to the other. In this talk, we explore some of the combinatorial properties of the sub- and super-sequence relations, survey various algorit...
Time Warp Edit Distance with Stiffness Adjustment for Time Series Matching
In a way similar to the string-to-string correction problem we address time series similarity in light of a time-series-to-time-series-correction problem for which the similarity between two time series is measured as the minimum cost sequence of "edit operations" needed to transform one time series into another. To define the "edit operations" we use the paradigm of a graphical editing proce...
A New Family of String Classifiers Based on Local Relatedness
This paper introduces a new family of string classifiers based on local relatedness. We use three types of local relatedness measurements, namely, longest common substrings (LCStr’s), longest common subsequences (LCSeq’s), and window-accumulated longest common subsequences (wLCSeq’s). We show that finding the optimal classifier for two given sets of strings (the positive set and the negative set)...
Linear Time Algorithm for the Generalised Longest Common Repeat Problem
Given a set of strings U = {T1, T2, . . . , T }, the longest common repeat problem is to find the longest common substring that appears at least twice in each string of U, considering direct, inverted, mirror as well as everted repeats. In this paper we define the generalised longest common repeat problem, where we can set the number of times that a repeat should appear in each string. We pres...
Publication date: 1996